Fix encapsulated pixeldata handling #103

weanti · 2025-06-26T13:20:33Z

Fix constructing frame offsets for encapsulated pixel data.
Fix reading frames from encapsulated pixel data.

Remove restrictions for bits stored value, because it is not valid in all cases e.g. CT.
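
To illustrate the bits stored point, here is a minimal sketch in C. The helper names (`sketch_bits_stored_ok`, `sketch_native_frame_size`) are hypothetical and are not part of libdicom, and the sketch assumes Bits Allocated is a multiple of 8. It accepts any Bits Stored value from 1 up to Bits Allocated instead of a fixed list, and it sizes an uncompressed frame from Bits Allocated, since that is what determines how many bytes each sample occupies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not libdicom API: Bits Stored only has to fit
 * inside Bits Allocated, so a CT with 16 bits allocated may store
 * 12, 13, 14, 15 or 16 bits. */
bool sketch_bits_stored_ok(uint16_t bits_stored, uint16_t bits_allocated)
{
    return bits_stored >= 1 && bits_stored <= bits_allocated;
}

/* Hypothetical helper, not libdicom API: size in bytes of one
 * uncompressed frame. Assumes bits_allocated is a multiple of 8.
 * The arithmetic is done in 64 bits so large frames cannot wrap. */
uint64_t sketch_native_frame_size(uint16_t rows,
                                  uint16_t columns,
                                  uint16_t samples_per_pixel,
                                  uint16_t bits_allocated)
{
    uint64_t bytes_per_sample = bits_allocated / 8;

    return (uint64_t) rows * columns * samples_per_pixel * bytes_per_sample;
}
```

For example, a 512 x 512 single-sample image with 16 bits allocated needs 524288 bytes per frame, whether 12 or 16 of those bits are stored.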

jcupitt · 2025-06-30T10:56:48Z

Hi @weanti,

This looks great! Thank you for doing this work.

Could you explain what kinds of DICOM you are working with? Do you have some sample DICOMs I could use for testing?

You probably saw the other PR (#102): we should probably merge that first, since it covers some of the same ground.

weanti · 2025-06-30T11:58:53Z

Hi,

Thank you for your response.
In the meantime I did further testing and found problems: memory overwrites and offset-handling errors.
Sure, I'll try to attach or send some test files. By the way, I used JPEG 2000 encoded DICOM images.
First I need to polish this PR.

jcupitt · 2025-06-30T15:01:01Z

OK, let's tag this as a draft for now.

weanti · 2025-07-01T19:27:29Z

I think I solved the problems.
I have attached a bunch of DICOM files. See the readme.txt for important properties.
UPDATE: uploading files doesn't seem to work. Here is a link to a shared zip file: https://drive.google.com/file/d/1UPgq89YLx94XyiuGOupR2t0LtgQ4VnkS/view?usp=sharing

weanti · 2025-08-27T07:04:19Z

@jcupitt What's your opinion about this PR? The other PR (the one before this) is updated. If that is merged, then I'll rebase and update my PR.

jcupitt · 2025-08-27T09:22:18Z

Hi @weanti, sorry, I was on holiday and then got distracted by other projects.

I'll look this over again now.

jcupitt · 2025-08-27T11:48:28Z

... I saw one final tiny issue in #102, when that's resolved I'll look at this more closely.

weanti · 2025-08-27T12:04:34Z

Thank you for the feedback.

weanti · 2025-09-30T09:02:51Z

Resolved conflicts.

jcupitt · 2025-09-30T19:43:55Z

I'll read this tomorrow. Thanks for the update!

jcupitt · 2025-10-01T07:47:57Z

I downloaded the zip file, these are useful samples!

What's the licence? Could we add them to the test suite?

weanti · 2025-10-01T09:10:42Z

I downloaded the compsamples_j2k archive (a set of jpeg 2000 compressed images) from ftp://medical.nema.org/MEDICAL/Dicom/DataSets/WG04
This includes the SC1_J2KR and VL1_J2KR images.
Another set of images, nema97cd.zip, was downloaded from https://www.dcmtk.org/download/images/
This contains im309 in the gems/dlx folder.
I think these are public domain.
I'll try to find the source of segmentation_j2k.

src/dicom-file.c

src/dicom-parse.c

jcupitt · 2025-10-01T09:56:06Z

Great! Still to do:

add some tests
add a line to the changelog and credit yourself
reformat for libdicom style
some minor restructuring, as noted

fedorov · 2025-10-02T16:00:51Z

@dclunie can you confirm what is the license for the images available from the NEMA FTP server?

@michaelonken @jriesmeier how about the images shared in https://www.dcmtk.org/download/images/ ?

dclunie · 2025-10-02T18:45:45Z

There is no specified license for the CAR97, NEMA97, or WG04 images - we had intended all of these to be publicly usable without restriction (we gave away the CDs at meetings, and shared the images online by anonymous ftp), but never defined a license.

michaelonken · 2025-10-03T09:59:51Z

David said it all ☝️

[Edit: The folder ddsm/ has its own README. The collection of images stems from the University of South Florida and has originally been provided in non-DICOM format. We converted it (I don't remember the exact context) and offered to host these images on the OFFIS servers (which they were fine with). I found a copy of the now-offline original website in the Internet archive. The images have been part of a research grant; maybe you find more information if you dig through the site.]

weanti · 2025-10-06T12:15:43Z

I'm working on tests. Will take some time due to limited availability.

weanti · 2025-10-09T12:00:17Z

Added some tests. These are more like functional tests, because the implementation that handles encapsulated pixel data is not part of the public API.
The test data is generated and may not be DICOM conformant, e.g. it may not be loadable by real applications. The tests focus on the "happy path". Shall I add some tests for the error cases as well?

weanti · 2025-11-07T09:57:58Z

@jcupitt Shall we proceed with this PR?

jcupitt · 2025-11-07T15:47:13Z

Sorry @weanti I got distracted again.

I'll do a final review now.

jcupitt

Sorry :( one more thing.

src/dicom-parse.c

jcupitt · 2025-11-08T14:13:45Z

src/dicom-parse.c

-            if (!read_tag(&state, &tag, &position) ||
-                !read_uint32(&state, &length, &position)) {
-                return false;
+        // each frame may consist of several fragments, so we need to scan each fragment to find the next frame


I don't think you need this complicated loop. I think (I think! I hope!) all you need to do is scan num_frames items and collect the offsets. This will collect the right number of frames whether or not frames are split into many items.

How about changing this back to:

            return false;
        }
    } else {
        // the BOT is missing, we must scan pixeldata to find the position of
        // each frame
        dcm_log_info("building Offset Table from Pixel Data");
        // 0 in the BOT is the offset to the start of frame 1, ie. here
        *first_frame_offset = position;
        position = 0;
        for (int i = 0; i < num_frames; i++) {
            if (!read_tag(&state, &tag, &position) ||
                !read_uint32(&state, &length, &position)) {
                return false;
            }
            if (tag == TAG_SQ_DELIM) {
                dcm_error_set(error,
                              DCM_ERROR_CODE_PARSE,
                              "reading BasicOffsetTable failed",
                              "too few frames in PixelData");
                return false;
            }
            if (tag != TAG_ITEM) {
                dcm_error_set(error,
                              DCM_ERROR_CODE_PARSE,
                              "building BasicOffsetTable failed",
                              "frame Item #%d has wrong tag '%08x'",
                              i + 1,
                              tag);
                return false;
            }
            // step back to the start of the item for this frame
            offsets[i] = position - 8;
            // and seek forward over the value
            if (!dcm_seekcur(&state, length, &position)) {
                return false;
            }
        }
    }
    return true;
}

ie. just removing the check for the end of sequence tag.

I suppose we could check for end-of-sequence in the num_frames > 1 case, but I don't know if it'd add much.
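
The scan discussed here can be sketched in isolation as follows. This is not the libdicom code: it reads from an in-memory buffer instead of a stream, every name in it (`sketch_build_offsets`, `SKETCH_TAG_ITEM`, the `read_le*` helpers) is made up for the example, and it assumes little-endian encoding with `items` pointing at the first Item that follows the Basic Offset Table item. It reads `num_frames` Item headers and records where each one starts, so each offset is the previous offset plus the 8-byte tag and length header plus the previous Item's length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Item tag (FFFE,E000), packed as group << 16 | element. */
#define SKETCH_TAG_ITEM 0xFFFEE000u

static uint32_t read_le16(const uint8_t *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8);
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
        ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* A tag is stored as a 16-bit group followed by a 16-bit element. */
static uint32_t read_tag_le(const uint8_t *p)
{
    return (read_le16(p) << 16) | read_le16(p + 2);
}

/* Record the start of the first num_frames Items in offsets[].
 * Returns false if the buffer ends early or a header is not an Item
 * (for example the sequence delimiter, meaning too few frames). */
static bool sketch_build_offsets(const uint8_t *items,
                                 size_t size,
                                 uint32_t num_frames,
                                 uint64_t *offsets)
{
    size_t position = 0;

    assert(items != NULL && offsets != NULL);

    for (uint32_t i = 0; i < num_frames; i++) {
        /* need 4 bytes of tag and 4 bytes of length */
        if (size - position < 8) {
            return false;
        }

        uint32_t tag = read_tag_le(items + position);
        uint32_t length = read_le32(items + position + 4);

        if (tag != SKETCH_TAG_ITEM) {
            return false;
        }

        /* the offset points at the Item tag, not at the value */
        offsets[i] = position;
        position += 8;

        /* skip the value, refusing lengths that run past the buffer */
        if (length > size - position) {
            return false;
        }
        position += length;
    }

    return true;
}
```

With two Items of 4 and 2 bytes, the recorded offsets are 0 and 12, and asking for a third frame fails because the next header is the sequence delimiter.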

with a small addition this (original) code seems to work fine

src/dicom-parse.c

jcupitt · 2025-11-08T14:31:43Z

I'm still worried by the BOT builder, I think you don't need the new loop you made.

How about just reading num_frames tags, exactly as before. If num_frames is 1, then you'll just have the distance to the first frame. If it's more than one, then your assumption of one frame == one item will work fine. All you need to do is skip the "next tag is delim" check we had at the end for the num_frames == 1 case.

One optimisation we could do would be to change this around to scan all items rather than scan all frames. We could record total frame length somewhere, then dcm_parse_encapsulated_frame() wouldn't need to make two passes over the data, it could just use the frame length we found in the BOT build. That's probably for a later PR though.

The new code isn't following the libdicom layout style exactly, but I can fix that after merge if you like (or do change it yourself, of course).

Otherwise this looks good!

jcupitt · 2025-11-08T14:33:38Z

src/dicom-parse.c

+            return NULL;
+        }
+        dcm_seekcur(&state, fragment_length, &position);
+        *length += fragment_length;


I suppose there's a potential uint32 overflow here. length should ideally be a uint64, though I don't suppose it matters much.

Theoretically it could happen. I'll use uint64_t inside and add some range checking.
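
One way to do the range checking mentioned here is sketched below. The function name and the `limit` parameter are hypothetical, chosen for the example rather than taken from the PR: the running total is kept in 64 bits, and a fragment is only added if the result stays within a caller-supplied limit such as `UINT32_MAX`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not libdicom API: add one fragment length to a
 * 64-bit running total. Returns false and leaves *total unchanged if
 * the new total would exceed limit. */
bool sketch_add_fragment_length(uint64_t *total,
                                uint32_t fragment_length,
                                uint64_t limit)
{
    /* check the fragment first so limit - fragment_length cannot wrap */
    if (fragment_length > limit ||
        *total > limit - fragment_length) {
        return false;
    }

    *total += fragment_length;

    return true;
}
```

The caller can then narrow the total back to a 32-bit length safely, because every accepted total is known to be at most `limit`.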

weanti · 2025-11-11T22:38:18Z

The new code isn't following the libdicom layout style exactly, but I can fix that after merge if you like (or do change it yourself, of course).

I tried to follow it. If you could adapt it for me that would be great. I'll check it and see what the remaining style issues are.

jcupitt

Just tiny things now.

src/dicom-file.c

src/dicom-parse.c

tests/check_dicom.c

src/pdicom.h

src/dicom-parse.c

jcupitt · 2025-11-12T14:12:04Z

src/dicom-parse.c

+*/
+char *dcm_parse_encapsulated_frame(DcmError **error,
+				   DcmIO *io,
+				   bool implicit,


Final thing! This indent still looks funky.

jcupitt · 2025-11-19T10:48:39Z

Sorry for the delay, and thank you for adding this useful feature!

weanti · 2025-11-19T12:14:57Z

No problem. Thank you for the review, and all the work you put in this excellent library.

Antal Ispanovity added 3 commits June 26, 2025 15:03

remove restriction for bits stored, because its value can be modality specific e.g. in case of CT it can be 12,13,14,15,16

87c0436

fix offset reading and frame reading for encapsulated pixel data

835ed96

fix offset calculation for encapsulated fragments: length was essentially not read, undefined length was assumed; use more intuitive offset calculation: previous offset + tag and length size + current length

d190231

jcupitt marked this pull request as draft June 30, 2025 15:01

split regular and encapsulated frame parsing

0ceb9cc

fix indexing problem when creating offsets table; fix frame size calculation: bits allocated was not considered

weanti marked this pull request as ready for review July 3, 2025 04:18

fix offset calculation for non-encapsulated frames: bits allocated was not considered

c02f192

Merge branch 'main' into main

7a08c78

jcupitt reviewed Oct 1, 2025

coding style, code documentation, review fixes: simplification, error handling, etc

ab8b9f0

functional tests for encapsulated pixel data handling

ac71ff4

jcupitt reviewed Nov 8, 2025

weanti added 2 commits November 11, 2025 23:41

simplify BOT construction

2c33aa4

avoid overflow for length value in dcm_parse_encapsulated_frame

simplify first BOT offset calculation

9cb222d

restore some error handling

jcupitt reviewed Nov 12, 2025

review fixes: coding style, comments, error messages

3fd0390

jcupitt reviewed Nov 12, 2025

indentation

9950851

jcupitt merged commit a4c4e40 into ImagingDataCommons:main Nov 19, 2025
6 checks passed
